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Abstract 

Annotating and interpreting the results of genome-wide association studies (GWAS) remains challenging. Assigning 
function to genetic variants as expression quantitative trait loci is an expanding and useful approach, but focuses 
exclusively on mRNA rather than protein levels. Many variants remain without annotation. To address this problem, we 
measured the steady state abundance of 441 human signaling and transcription factor proteins from 68 Yoruba HapMap 
lymphoblastoid cell lines to identify novel relationships between inter-individual protein levels, genetic variants, and 
sensitivity to chemotherapeutic agents. Proteins were measured using micro-western and reverse phase protein arrays from 
three independent cell line thaws to permit mixed effect modeling of protein biological replicates. We observed enrichment 
of protein quantitative trait loci (pQTLs) for cellular sensitivity to two commonly used chemotherapeutics: cisplatin and 
paclitaxel. We functionally validated the target protein of a genome-wide significant trans-pQTL for its relevance in 
paclitaxel-induced apoptosis. GWAS overlap results of drug-induced apoptosis and cytotoxicity for paclitaxel and cisplatin 
revealed unique SNPs associated with the pharmacologic traits (at p<0.001). Interestingly, GWAS SNPs from various regions 
of the genome implicated the same target protein (p<0.0001) that correlated with drug induced cytotoxicity or apoptosis 
(p<0.05). Two genes were functionally validated for association with drug response using siRNA: SMC1A with cisplatin 
response and ZNF569 with paclitaxel response. This work allows pharmacogenomic discovery to progress from the 
transcriptome to the proteome and offers potential for identification of new therapeutic targets. This approach, linking 
targeted proteomic data to variation in pharmacologic response, can be generalized to other studies evaluating genotype- 
phenotype relationships and provide insight into chemotherapeutic mechanisms. 
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Introduction 

Pharmacogenomics aims to identity clinically actionable mark- 
ers associated with response or toxicity; for oncology, evaluating 
genotype-phenotype relationships is particularly important be- 
cause non-response and adverse events associated with chemo- 
therapy can be life-threatening. Drug response and toxicity are 
thought to be multi-genic traits requiring whole genome studies to 
capture the most relevant variants. To complement clinical data 
and enhance discovery of genetic variants associated with 
sensitivity to drugs using a whole genome approach, we and 
others (reviewed by Wheeler and Dolan [1]) have developed 



cell-based models using International HapMap lymphoblastoid 
cell lines (LCLs). The genetic and expression environment for 
these cells has been well characterized thus allowing for genome- 
wide association studies (GWAS) and functional follow-up studies. 
Genetic variants associated with a given chemotherapeutic 
discovered in the LCL pharmacogenomic model have been 
replicated in clinical trials, arguably the most relevant system for 
biomedical science [2,3,4,5,6]. 

In addition to their value in pharmacogenomics discovery 
[7,8,9,10,1 1], LCLs have had broad utility as a discovery tool for 
genetic markers associated with many functional phenotypes, 
including: gene expression [12,13,14,15,16]; modified cytosines 
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Author Summary 

The central dogma of biology explains that DNA is 
transcribed to mRNA that is further translated into protein. 
Many genome-wide studies have implicated genetic 
variation that influences gene expression and that 
ultimately affect downstream complex traits including 
response to drugs. However, because of technical limita- 
tions, few studies have evaluated the contribution of 
genetic variation on protein expression and ensuing 
effects on downstream phenotypes. To overcome this 
challenge, we used a novel technology to simultaneously 
measure the baseline expression of 441 proteins in 
lymphoblastoid cell lines and compared them with 
publicly available genetic data. To further illustrate the 
utility of this approach, we compared protein-level 
measurements with chemotherapeutic induced apoptosis 
and cell-growth inhibition data. This study demonstrates 
the importance of using protein information to understand 
the functional consequences of genetic variants identified 
in genome-wide association studies. This protein data set 
will also have broad utility for understanding the relation- 
ship between other genome-wide studies of complex 
traits. 



[17]; variation in mRNA decay rates across individuals [18]; 
DNase hypersensitivity [19]; and baseline micro RNA levels [20]. 
In addition, the LCL model has been used to identify genetic 
markers of inflammatory cell death [21], bipolar disorder [22], 
and response to serotonin reuptake inhibitors [23,24]. Therefore, 
incorporating protein expression information into an existing 
dataset of genetic, epigenetic, mRNA expression, and drug 
sensitivity has the potential to identify novel candidates and 
mechanisms relevant to pharmacologic traits. 

Previously, we reported that SNPs associated with inter- 
individual variation in cytotoxicity of chemotherapeutic agents in 
LCLs are enriched in expression quantitative trait loci (eQTLs) 
and separately, enrichment was observed for eQTLs associated 
with ten or more target genes [25] . SNPs that overlapped between 
preclinical LCL studies and outcomes of patients treated with the 
same drug were also enriched in eQTLs [2]. An implicit 
assumption in these analyses and studies of other complex traits 
is that mRNA transcript abundances are a suitable proxy 
measurement for their corresponding protein levels. However, 
recent data has demonstrated poor overall correlations between 
mRNA and protein expression [26,27,28,29,30]. 

To investigate the role of genomics in protein expression and 
the role protein expression plays in altering pharmacologic 
responses, we employed the micro-western array (MWA) [31], a 
method that is approximately 1 000-fold more sensitive and has an 
~ 1 00-fold greater dynamic range than standard mass spectrom- 
etry methods and requires ~200-fold less sample and antibody 
than standard immunoblotting methods [32,33]. After screening 
4,366 previously unvalidated antibodies targeting 1,848 transcrip- 
tion factors (TFs) and 200 well-validated antibodies targeting cell 
signaling proteins, we used MWAs and reverse phase protein 
arrays (RPPAs) to collect protein data regarding 441 protein 
isoforms from 68 HapMap Yoruba (YRI) LCLs. Baseline protein 
levels were evaluated for their correlations with cellular sensitivity 
to cisplatin and paclitaxel, two of the most widely-used and 
successful chemotherapeutics worldwide that are mechanistically 
distinct [34,35,36]. The measurement of proteins in HapMap 
LCLs is of great value to complement the extensive publicly 
available genetic information already available on these cell lines. 



Although LCLs are not tumor cells, upon transformation they are 
likely to have changes in pathways that control cell cycle and cell 
proliferation, which are relevant pathways for anti-cancer drugs. 
Furthermore, we identified genetic variants associated with 
chemotherapeutic sensitivity that acted through their effect on 
protein levels. We observed an enrichment of pQTLs in genome 
variants associated with pharmacologic phenotypes. We combined 
this information to identify proteins relevant for pharmacologic 
phenotypes through multiple independent SNPs throughout the 
genome. 

Results 

Biological replicates enable robustness in measurement 
of protein levels 

Prior to our global analysis, a pilot study consisting of three 
independent biological replicates of six cell lines demonstrated 
significant variation not only among protein levels from different 
individuals, but also among cells thawed and propagated 
independendy from the same individual. Based on a significant 
thaw effect explaining 3.75% of global protein expression variation 
(p = 0.01, -Ftest), we measured baseline, steady-state protein levels 
from three independent thaws (thawed simultaneously) from each 
of 68 unrelated YRI LCLs to have a more accurate estimate of 
inter-individual variation in protein expression. These measure- 
ments were evaluated with both fixed effect (by averaging the three 
thaws) and mixed effect (by incorporating a random thaw effect 
per individual) models. Mixed effect modeling (MEM) allowed us 
to gain additional power from multiple measurements compared 
with simply averaging across the biological replicates in a linear 
model (Figure la). Relationships identified by fixed effect that 
had conflicting trends (i.e. positive and negative associations) 
across biological replicates were more likely to be false 
positives (Figure lb) than the observations that were repro- 
ducible by MEM (across biological replicates) (Figure lc); we 
therefore considered the MEM to be the more robust approach 
and used this method for all subsequent estimates of protein-drug 
associations. 

Relationship between drug phenotypes with protein 
levels 

Cell growth inhibition and caspase 3/7 activation were 
measured following treatment of 68 unrelated YRI LCLs with 
cisplatin (5 |J.M) or paclitaxel (12.5 nM). Notably, the correlation 
between cytotoxicity and apoptosis was greater for paclitaxel 
(r^ = 0.35) than cisplatin (r^ = 0.04), indicating that apoptotic cell 
death was a larger contributor to paclitaxel-mediated cell growth 
inhibition compared with cisplatin (Figure SI). We also assessed 
the effect of date of cell thaw on cellular phenotypes and found a 
significant correlation across two independent thaws (Figure S2; 
p<0.0001 and r 2 >0.28 for cytotoxicity, p<0.003 and r 2 >0.38 for 
apoptosis). 

From a starting pool of 4,366 antibodies, 198 antibodies 
producing a single predominant signal at the predicted molecular 
weight were carried forward for population-level quantification 
with the RPPA approach and 243 antibodies that displayed at least 
one band the size of the targeted protein isoform of interest with a 
signal-to-noise ratio &3 (but additional bands) were selected for 
subsequent population-level quantification by MWAs. We quan- 
tified the expression of 441 proteins across the same set of 68 
individual LCLs for which we measured responses to chemother- 
apeutic agents. At an FDR of 20%, 64 proteins were associated 
with one or more of the four drug phenotypes. At p<0.05, 52 and 
60 protein levels were associated with paclitaxel-induced apoptosis 
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Figure 1. Protein levels regressed against four cytotoxicity phenotypes using fixed effect or mixed effect models representing the 
three biological replicates. We analyzed 441 protein levels against 5 uM cisplatin induced apoptosis and cytotoxicity and 12.5 nM paclitaxel 
apoptosis and cytotoxicity using both fixed effect and mixed effect modeling (a). The Y-axis represents the total number of protein-drug phenotype 
(A, apoptosis and C, cytotoxicity) correlations (p<0.05) using fixed effect (medium grey) or mixed effect (light grey) or those that showed a 
correlation for both methods (dark grey). Five micromolar cisplatin induced caspase activity correlated with WHSC1 protein levels demonstrates 
strong association (p = 0.009) using the fixed effect, whereas the individual thaw association reveals no association from the third thaw, resulting in a 
greater than p>0.05 MEM result (b). Five micromolar cisplatin-induced caspase activity correlated with STAT3A (—90 kDa) protein levels across three 
thaws ranging had p<0.05 ranging from 0.02 to 1.6x10 -6 and a mixed effect p-value of 1.55x10~ 7 (c). 
doi:1 0.1 371 /journal.pgen.1 0041 92.g001 



and cytotoxicity, respectively, and 47 and 39 proteins were 
associated with cisplatin-induced apoptosis and cytotoxicity, 
respectively. Table S2 details these nominal associations for each 
phenotype and Table 1 highlights the top three associations for 
each phenotype. We compared the overlap between the two drugs 
and identified four proteins that were unique to the apoptotic 
pathway including CDKN2B, PDK1, TFB1M and ZNF132. 
EP300 was the only protein exclusively associated with cytotoxicity 
for both drugs. This observation implies that loss of cell viability in 
response to these two drugs occurs through distinct mechanisms. 

Using hierarchical clustering of the drug-protein effect sizes, 
seven significant clusters were defined by permutation analysis 
(p<0.001) (Figure 2a). We were unable to identify any significandy 
enriched pathways due to the limited and biased background set of 
proteins evaluated; however, we did observe proteins of similar 
function within the clusters. Protein levels in cluster one (Figure 2b) 



were associated with increased resistance to both drugs when 
measured for either phenotype. Proteins in this cluster included 
many known metabolism-regulating proteins, DNA damage 
response factors, proteins associated with innate immune response, 
and transcription factors associated with various stages of 
developmental biology. Metabolism-regulating proteins included 
mTor, p70S6K(T421/S424), Gabl(Y627), GSK3beta, and ONE- 
CUT2. DNA damage-related proteins in cluster one included 
apoptosis antagonizing transcription factor (AATF) and structural 
maintenance of chromosomes protein 1A (SMC1A). Proteins with 
known associations to immune response included several ubiquitin 
ligases such as TRIM13 and TRIM26. 

Protein levels in cluster 3 (Figure 2c) were associated with 
increased cellular sensitivity to both cisplatin and paclitaxel 
phenotypes and included many proteins related to calcium 
signaling: phospholipase C gamma 2 (PLCG2), c-Src (SRC) and 
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Table 1. Top proteins associated with cisplatin and paclitaxel phenotypes using a mixed effect model. 



Protein 


Gene Name 


Phenotype 


MEM p-value 


Beta 


p-S6.ribosomal. protein. (S240/244) 


RPS6 


Cisplatin Cytotoxicity 


5.10E-04 


1.08 


NCKAP1 L.75 kDa 


NCKAP1L 


Cisplatin Cytotoxicity 


1.16E-03 


-1.36 


ZNF497 


ZNF497 


Cisplatin Cytotoxicity 


2.56E-03 


-0.84 


STAT3A (-90 kDa) 


STAT3 


Cisplatin Apoptosis 


1 .46E-07 


-0.78 


STAT3B (-80 kDa) 


STAT3 


Cisplatin Apoptosis 


1.12E-04 


-0.59 


ENOI 


ENOI 


Cisplatin Apoptosis 


3.16E-04 


-0.48 


STAT3A (-90 kDa) 


STAT3 


Paclitaxel Cytotoxicity 


1.13E-05 


1.23 


GTF2F2 


GTF2F2 


Paclitaxel Cytotoxicity 


3.10E-04 


-1.15 


ZNF266.75-100 


ZNF266 


Paclitaxel Cytotoxicity 


3.12E-04 


-0.89 


ENOI 


ENOI 


Paclitaxel Apoptosis 


1 .09E-05 


-0.47 


STAT3A (-90 kDa) 


STAT3 


Paclitaxel Apoptosis 


1 .06E-03 


-0.42 


ZNF132 


ZNF132 


Paclitaxel Apoptosis 


1.99E-03 


-0.42 



doi:1 0.1 371 /journal.pgen.1 0041 92.t001 



focal adhesion kinase (FAK). Other proteins in cluster three 
included the tumor suppressor pl5ink4b (CDKN2B), estrogen 
receptor beta (ESR2), beta actin (BACT), alpha tubulin (TUBA), 
and several transcription factors including c-MYC (MYC), 
Hairless homolog (HR), H6 family homeobox 1(HMX1), and 
ETS-related transcription factor Elf-4 (ELF4). 

Protein levels in cluster 7 (Figure 2d) were associated more 
strongly with cellular sensitivity/ resistance to drug cytotoxicity as 
compared with drug-induced apoptosis. Drug-induced cytotoxicity 
is a broad phenotype that includes cellular processes such as 
necrosis, cell death through apoptotic and non-apoptotic path- 
ways, cell cycle arrest, and damaged cells undergoing DNA repair 
[37], whereas caspase 3/7 activation represents a specific process 
of cell death. 

Top protein quantitative trait locus implicates DID01 for 
paclitaxel-induced apoptosis 

Upon evaluation of all proteins with a genome-wide significant 
pQTL, we identified one protein that was also associated with 
paclitaxel-induced apoptosis. The trans pQTL on chromosome 16, 
rs6834, was significandy correlated (p = 2.66x10 15 ) with death 
inducer-obliterator 1 (DIDOl) protein levels (Figure 3a). DIDOl 
was in cluster 3 (Figure 2c), indicating that increased baseline levels 
conferred greater cellular sensitivity to both chemotherapeutic 
agents. DIDOl protein levels were significantly correlated with 
paclitaxel-induced apoptosis (p = 0.01 r 2 = 0.02; Figure 3b). How- 
ever, the DIDOl pQTL was not significantly associated with 
paclitaxel-induced apoptosis (p = 0.25, Figure 3c). Despite the lack 
of statistical significance (likely because of small sample size), the 
directionality was consistent with the observed protein relation- 
ship: cells containing two C alleles had lower levels of DIDOl and 
lower paclitaxel-induced caspase 3/7 activation. DIDOl mRNA 
levels were not associated with paclitaxel apoptosis (p>0.05), 
suggesting that this relationship was protein-specific. 

Using RNA interference, we performed gene knockdowns in 
YRI LCLs and examined the effect of knockdown on paclitaxel- 
induced cytotoxicity and apoptosis. Three different LCLs were 
nucleofected with siRNA against DIDOl. Although knockdown 
levels varied considerably, the maximal degrees of protein 
knockdown observed for 24 or 48 hours in 18522, 18853, and 
19192, were 20%, 48%, and 59%, respectively. When we pooled 
data from all cell lines and experiments using a MEM, knockdown 



of DIDO 1 resulted in a significant (p = 0.005) decrease in 
paclitaxel-induced caspase activity. On average, paclitaxel-in- 
duced apoptosis was decreased by 11.9% in cells following 
knockdown of DIDO 1 (Figure 3d). 

Enrichment of SNPs associated with chemotherapeutic 
phenotypes 

Using the pQTLs and eQTLs (unadjusted p<10~ ) from the 
genes included in our protein dataset, we evaluated enrichment 
with paclitaxel and cisplatin-induced cytotoxicity and apoptosis 
associated SNPs at unadjusted p<10 5 (Figure 4). For cisplatin, 
only the apoptosis phenotype demonstrated pQTL enrichment 
(p<0.001) (Figure 4a, 4b, left panels). Conversely, both paclitaxel 
phenotypes demonstrated pQTL enrichment (Figure 4c, 4d). 
When evaluating eQTLs, only cisplatin cytotoxicity showed 
enrichment for eQTLs (Figure 4b). However, when evaluating 
all expressed genes, eQTLs showed enrichment for all drugs and 
phenotypes except for cisplatin-induced apoptosis (data not 
shown). 

Utilizing pQTLs to identify proteins implicated in cisplatin 
and paclitaxel cellular response 

Using both cell growth inhibition and apoptosis as cellular 
phenotypes, we identified pQTLs (defined at p<10~ 4 ) associated 
with these phenotypes at p<0.001. From that overlap of pQTLs, 
we then analyzed the relationship between target protein levels 
and the respective drug phenotype (p^0.05) (Figure 5). Overlap- 
ping GWAS signals identified five proteins for cisplatin phenotypes 
and 21 proteins for paclitaxel phenotypes (Table S3). For each 
phenotype, we also identified individual lists of proteins-pQTL 
pairs that both associate with cisplatin or paclitaxel phenotypes 
(Table S4). For cisplatin GWAS, there were 79 pQTLs targeting 
27 proteins for cytotoxicity and 169 pQTLs targeting 27 proteins 
for apoptosis. For paclitaxel GWAS, there were 107 pQTLs 
targeting 38 proteins for cytotoxicity and 1 19 pQTLs targeting 42 
proteins for apoptosis. Interestingly, the protein SRC was 
implicated through all four phenotypes. 

We prioritized proteins for functional studies using the apoptosis 
relationship for paclitaxel and the cytotoxicity relationship for 
cisplatin. Among the five proteins whose baseline expression levels 
associated with cisplatin cytotoxicity and apoptosis, we found 
structural maintenance of chromosomes 1A (SMC1A) to have the 
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Figure 2. Establishing hierarchical clustering of baseline protein levels correlated with cisplatin and paclitaxel phenotypes. 

Hierarchical clustering was performed on 370 protein levels (rows) for 5 u,M cisplatin apoptosis and cytotoxicity and 12.5 nM paclitaxel apoptosis and 
cytotoxicity (columns). The correlation for each protein-drug phenotype pair is indicated with blue showing increased protein levels associating with 
greater sensitivity to the drug, red showing increased protein levels associating with resistance to the drug and white indicating no correlation (a). 
The number of significant clusters was determined by performing 1000 permutations of the column correlations, clustering them, and selecting the 
number of observed clusters at a tree height that significantly exceeded all tree heights from the permutations (k = 7, p<0.001). Clusters 1 (b) and 3 
(c) depict proteins that are related in the same direction to all cellular phenotypes. Cluster 7 (d) depicts proteins more related strongly to drug 
sensitivity through cytotoxicity than apoptosis. 
doi:1 0.1 371 /journal.pgen.1 0041 92.g002 



most significant relationship with cytotoxicity (p = 0.005, 
r = 0.039) (Figure 6a and 6b). We therefore selected it for further 
functional validation. SMC1A did not associate with either 
cisplatin phenotype at the mRNA level suggesting that this was 
a protein-specific relationship. Because more proteins were 
associated with paclitaxel-mediated apoptosis and cytotoxicity 
phenotypes, we prioritized functional follow-up based on a 
combination of p-values and q-values (to correct for multiple 
hypothesis testing). At p<0.005, five proteins were significantly 
associated with paclitaxel-induced apoptosis. Zinc finger protein 
569 (ZNF569) (Figure 6c, 6d) had the lowest association q value. 
At the mRNA level, ZNF569 had a weak correlation with 
paclitaxel-induced apoptosis (p = 0.04, r 2 = 0.06), but no relation- 
ship with paclitaxel-induced cytotoxicity. Table 2 lists the pQTLs 



that implicated SMC1A with the two cisplatin phenotypes and 
ZNF569 with the two paclitaxel phenotypes. We observed a 
different set of SNPs associated with each protein-drug pair that 
also associated with either apoptosis or cytotoxicity (Table 2). 

Because independent pQTLs associated with the drug-induced 
phenotypes, we functionally validated the relationship of these 
proteins with their respective drug-induced phenotypes. We 
selected three LCLs (18502, 19138,19201) with mid to high 
protein expression and performed siRNA nucleofection. We 
assessed knockdown at 24 and 48 hours post nucleofection. 
Knockdown of SMC1A protein levels varied across the cell lines; 
we did not observe more than 57%, 71%, and 62% protein 
knockdown for 18502, 19138, and 19201, respectively, for either 
time point. Using a MEM to examine the effect across cell lines, 



A B 

1 1 1 




GG CG CC 0.5 1.5 2.5 

Genotype n Paclitaxel-induced Log 2 Caspase Activity 




GG CG CC 19192 18522 18853 

Genotype 



Figure 3. Identification of a protein quantitative trait locus relevant for paclitaxel-induced apoptosis. On chromosome 16, rs6834 
genotypes were correlated with DID01 protein levels (p = 2.66x10~ 15 ) (a). DID01 protein levels were also significantly (p = 0.01) correlated with 
paclitaxel-induced apoptosis (b). The three shades of grey circles indicate data from each of the three thaws. Rs6834 was not significantly correlated 
with paclitaxel apoptosis (p>0.05); however, the CC individuals had both the lowest mean DID01 levels and lowest paclitaxel-induced apoptosis 
levels (c). Three LCLs were nucleofected with pooled DID01 or nontargeting control and apoptosis was measured 24 hrs after 12.5 nM paclitaxel (d). 
Mixed effect modeling revealed a significant (p = 0.005) reduction in caspase activity. 
doi:1 0.1 371 /journal.pgen.1 0041 92.g003 
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Figure 4. Evaluation of eQTL and pQTL enrichment associated 
with chemotherapeutic-induced phenotypes. Distribution of 
eQTL or pQTL count in 1000 permutations, each matching the MAF 
distribution of the chemotherapeutic drug sensitivity associated SNPs at 
p<10 -3 and controlling for linkage disequilibrium using recombination 
blocks. The observed number of pQTLs (left plots) and eQTLs (right 
plots) are shown relative to these background permutations for 
cisplatin apoptosis (a) and cytotoxicity (b) and paclitaxel apoptosis (c) 
and cytotoxicity (d). 
doi:1 0.1 371 /journal.pgen.1 0041 92.g004 

we determined that knockdown of SMC1A resulted in a 19% 
increase in apoptosis (p = 0.0002) and a 10.4% decrease in cell 
survival (p = 0.009) in response to cisplatin (Figure 7a). Knock- 
down of ZNF569 protein levels varied across the cell lines, but we 
observed no more than 45%, 58%, and 54% protein knockdown 
across 18502, 19138, and 19201, respectively, for either time 
point. Using a MEM to combine the effect across cell lines, 
knockdown of ZNF569 resulted in a 9.9% average reduction in 
apoptosis (p = 0.002) and a 26.8% increase in cell growth 
inhibition (p = 0.0001) (Figure 7b) in response to paclitaxel. 

Role of growth in protein-drug relationships 

Because growth rate has been previously identified as a heritable 
trait that is relevant in pharmacologic studies, we evaluated the 
relationship between steady state protein levels and intrinsic 
growth rate [38] for the proteins measured. Approximately 10% 
(45/441) of the proteins were correlated with growth at p<0.05 
(Table 3). Notably, SMC1A protein levels were significantly 



correlated with growth rate (p = 0.0007), whereas ZNF569 protein 
levels were not (p>0.05) (Figure S3). When we adjusted for growth 
rate, the association of SMC1A protein levels with cisplatin 
phenotypes was no longer significant (p>0.05). 

Discussion 

In this study, we evaluated 4,366 antibodies targeting 2,048 
unique proteins. From this set, we identified antibodies targeting 
441 protein isoforms expressed at baseline in LCLs and quantified 
them across three biological samples from 68 YRI LCLs. The use 
of multiple biological samples allowed us to implement mixed 
effects modeling to increase the robustness of our observations. 
Many protein expression levels were correlated with sensitivity to 
two cellular phenotypes (cytotoxicity and apoptosis) of two 
chemotherapeutic agents: paclitaxel and cisplatin. We validated 
one such finding through knockdown of DIDO 1 in three LCLs, 
which resulted in a decrease in paclitaxel-induced apoptosis. 
Quantitative trait loci for pharmacologic phenotypes were 
compared to quantitative trait loci for protein expression to better 
understand the functional significance of genetic variants contrib- 
uting to inter-individual variability in drug response. For each 
drug, we identified overlapping and unique sets of genetic variants 
associated with protein expression that were also correlated with 
drug-induced apoptosis and cytotoxicity. We further validated two 
such proteins through gene knockdown and concomitant modu- 
lation of cellular sensitivity to drug treatment: SMC1A levels were 
associated with resistance to cisplatin treatment, and ZNF569 
levels were associated with sensitivity to paclitaxel treatment. 

This study illustrates the utility of applying a highly-sensitive, 
novel, antibody-based technology to simultaneously measure 
many proteins across a large set of individuals. Using this method, 
we identified hundreds of novel genome loci that uniquely 
influence the expression of proteins that ultimately influence the 
sensitivity of cells to chemotherapeutic agents through both 
caspase 3/7 activation and other pathways leading to loss of cell 
viability. We evaluated protein expression in the International 
HapMap LCLs because these samples have previously been used 
for many studies relating genetics to gene expression [14,16,39] 
and cellular phenotypes [1], thus allowing us to perform 
comprehensive studies of genetics, protein expression, and 
pharmacology. LCLs are immortalized B-lymphocytes and, as a 
result, represent "non-cancerous" cells that may provide us with 
important protein targets for ameliorating bone marrow suppres- 
sion. However they also have some of the pathways relevant to 
anti-cancer drugs. We specifically chose the YRI population 
because of their greater genetic diversity relative to other 
populations. We expect that this data will have wide applicability 
to other genetic and pharmacological studies because of the 
important addition of protein levels to other studies. 

Whereas polymorphisms in coding regions that affect amino 
acid composition would seem to have the greatest effect on drug 
response, genetic variation that affects transcript abundance level 
has also been shown to affect drug response [25]. A dispropor- 
tionate number of drug response associated SNPs in a broad array 
of chemotherapeutic agents are eQTLs and are associated with the 
transcriptional expression level of multiple genes [25]. However, 
our work has demonstrated poor global correlations between inter- 
individual mRNA and protein levels (unpublished data). There- 
fore, functional annotation of pharmacologic SNPs and their 
relationships with proteins may result in important new discoveries 
as it has in this study. We note that 46,863 of the 121,484 trans 
pQTLs identified at P<10~ are also cis-acting eQTLs (within 
1 Mb upstream of the transcription start state to 1 Mb 
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Figure 5. Identification of common proteins associated with differing phenotypes through independent pQTL signals for cisplatin 
and paclitaxel. Genome-wide association results (p<10~ 3 ) on two cellular phenotypes (growth inhibition and apoptosis) for both cisplatin (a) and 
paclitaxel (b) were analyzed for pQTLs. All SNP that were pQTLs (p<10~ 4 ) had their target proteins evaluated for correlation with the drug phenotype 
(p£0.05). For each drug, the target protein overlap between cytotoxicity and apoptosis is indicated as the final number. The grey shading indicates 
the shift from numbers of variants to numbers of proteins and the SNPs presented in the grey area represent the number of SNPs targeting the 
proteins. 

doi:1 0.1 371 /journal.pgen.1 0041 92.g005 



downstream of the transcription end site) for at least one of the 
18,227 gene models quantified by RNA-Seq at P<0.05. This 
proportion (38.6%) is statistically enriched compared with the 
proportion of all single nucleotide variants genome-wide that are 
cis-eQTLs (36.6%, Fisher's exact test P<2.2xl0" 16 , odds 
ratio = 1 .09), suggesting that cis-acting may contribute to some 
extent to underlying trans-genetic regulation of protein levels. 

Because we performed multiple analyses to examine overlap 
and enrichment of protein and drug QTL, the p-value thresholds 
used in this study were more permissive relative to that typically 
used for genome-wide analyses. By contrast to various chemo- 
therapeutics that exhibit GWAS enrichment in eQTLs [25], 
paclitaxel GWAS results were not enriched in eQTLs; however, 
we identified enrichment in pQTLs for both paclitaxel-induced 
apoptosis and cytotoxicity phenotypes. Therefore, genetic variants 
associated with the level of a protein appear to be more important 
for sensitivity to this drug than mRNA regulatory variants. We 
functionally validated one of these observations, DIDOl, by 
siRNA knockdown. 



DIDOl is a tyrosine phosphorylated transcription factor that is 
localized to the nucleus [40] . DIDO 1 was also found within cluster 
3, which contained proteins with increased baseline levels 
correlating with greater cytotoxicity and apoptosis to each 
chemotherapeutic agent tested. DIDOl is generally believed to 
function through apoptosis-related processes; however, it has also 
been suggested to function in mitotic division based on gene 
overexpression in mice [41]. This proposed function provides a 
clear mechanistic connection to paclitaxel, a drug that kills cells 
through microtubule inhibition. 

Both paclitaxel and cisplatin have been in use for decades, and 
significant effort has been expended to identify strategies that 
result in increased tumor sensitivity to these agents, including 
targeting the activity of drug resistance pathways. However, this 
approach is only successful if the cancerous and non-cancerous 
cells differ in their response to modulation. Improving the 
therapeutic index for patients occurs if the "modulating agent" 
increases the sensitivity of chemotherapy in the tumor while 
decreasing toxicity in non-tumor tissues. This study offers an 
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Figure 6. Genetic variants relevant in chemotherapeutic response implicated through protein effects. For proteins implicated through 
SNPs in both apoptosis and cytotoxicity GWASes, mixed effect modeling was performed to measure the direction of the relationships. SMC1A, 
structural maintenance of chromosomes 1A, was positively associated (p = 0.0007) with 5 uM cisplatin induced apoptosis (a) and negatively 
associated (p = 0.004) with 5 uM cisplatin-induced cytotoxicity (b). ZNF569, zinc finger protein 569, was negatively associated (p = 0.0002) with 
12.5 nM paclitaxel-induced apoptosis (c) and positively associated (p = 0.0005) with 12.5 nM paclitaxel-induced cytotoxicity (d). All four plots display 
data from the three thaws represented by diamonds, triangles, or inverted triangles. 
doi:1 0.1 371 /journal.pgen.1 0041 92.g006 



opportunity to identify the relationship between transcription 
factors and signaling molecules and drug sensitivities in a non- 
tumor environment. For example, high levels of proteins 
identified in cluster 3 were associated with greater sensitivity to 
both cisplatin and paclitaxel; yet several of these proteins 
including c-Src [42,43] and c-Myc [44,45] have been shown to 
be overexpressed in tumor cells and their expression correlates 
with paclitaxel or cisplatin resistance. c-Src tyrosine kinase is 
overexpressed in a high proportion of ovarian cancers and 
ovarian cancer cell lines. Its inhibition, either pharmacologically 
or through gene knockdown, results in an increase in sensitivity 
of ovarian cancer cells to paclitaxel and cisplatin [43]. The 
increased cytotoxicity in response to c-Src inhibition was 
associated with a large increase in processing and activation of 
caspase-3. Our data support these proteins as potential drug 
targets, because reducing their levels in LCLs would result in 



lower sensitivity to the toxic effects of cisplatin and paclitaxel in 
contrast to cancerous cells. We anticipate that this dataset 
will therefore have great utility for the development of novel 
modulators of chemotherapy. 

Although LCLs are a more likely model for toxicity, we 
identified several relationships that have been recapitulated in 
tumor response. Signal transducer and activator of transcription 3 
(STAT3) had the strongest negative associations with cisplatin- 
and paclitaxel-induced apoptosis, suggesting high levels of STAT3 
protein conveyed drug resistance. STAT3 mRNA expression has 
previously been reported to be associated with cisplatin resistance 
in many cancer types, including head and neck [46], small cell 
lung carcinoma [47], and human epidermoid cancer cells [48], in 
which the CRE/ATF binding elements in the STAT3 promoter 
were shown to be important for mediating cisplatin resistance. 
STAT.3 mRNA expression has also been implicated in paclitaxel 
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Table 2. Protein QTLs implicating SMC1A for cisplatin and ZNF569 for paclitaxel. 
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Protein QTLs that also associate with either apoptosis or cytotoxicity are shown for each protein with bold indicating SMC1A and cisplatin and non-bold indicating 
ZNF569 and paclitaxel relationships. NA indicates that the p-value was greater than 0.001 for the drug phenotypes and 0.05 for the cis-eQTL associations. 
doi:1 0.1 371 /journal.pgen.1 0041 92.t002 



resistance. Knockdown of STAT3 conveyed sensitivity to paclitaxel 
in lung cancer cell lines [49]. STAT3 has been hypothesized as a 
potential target to modulate paclitaxel sensitivity in cancer patients 
[50]. PTEN is also an example of same direction of effect in LCLs 
and cancer cells, however unlike STAT3, increased levels of 



PTEN convey sensitivity. Recent studies have demonstrated that 
PTEN has the ability to enhance cancer cell sensitivity to 
particular anticancer agents. PTEN might reverse the chemore- 
sistance of human ovarian cancer cells to cisplatin through 
inactivation of the PI3K/AKT cell survival pathway and may 
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Figure 7. Functional validation of SMC1A and ZNF569. Three lymphoblastoid cell lines (18502, 19138, 19201) were evaluated 24 h and 48 h 
following 5 uM cisplatin treatment for cytotoxicity and apoptosis following nucleofection of pooled SIV1C1A and non-targeting control (a). Three 
lymphoblastoid cell lines (18502, 19138, 19201) were evaluated 24 h and 48 h following 12.5 nM paclitaxel treatment for cytotoxicity and apoptosis 
following nucleofection of pooled ZNF569 and non-targeting control (b). 
doi:1 0.1 371 /journal.pgen.1 0041 92.g007 
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Table 3. Proteins associated with growth. 



Protein Gene P-Value Rho 



At p<0.05, approximately 10% of the 441 proteins assayed correlate with cell 

growth including SMC1A. 

doi:1 0.1 371 /journal.pgen.1 0041 92.t003 



serve as a potential molecular target for the treatment of 
chemoresistant ovarian cancer [51]. 

SMC 1 A is part of the multi-protein cohesion complex required 
for sister chromatid cohesion. This cohesion complex has been 
shown to interact with the BRCA1 DNA repair protein and has 
been shown to be phosphorylated by ATM, a serine/threonine 
kinase activated by DNA double-strand breaks [52]. The cohesion 
complex has also been shown to be important for expression 
regulation and genomic stability [53]. Mutations in SMC1A have 
been shown to cause Cornelia de Lange syndrome, a multisystem 
developmental disorder with defects ranging from limb formations 
to cardiac, gastrointestinal, growth and cognitive systems [53]. 
Coding variants have also been identified in colon cancer [54] and 
have been implicated in impairing cellular response to toxic 
treatment [55]. Accumulated SMC1A protein has been linked to 
bortezomib-induced cell death, demonstrating its relevance for 
another chemotherapeutic agent [56], but this is the first study 
implicating SMC1A for cisplatin-induced cellular response. 
Recently, Wip 1 , an important signaling protein in cellular growth 
following DNA damage, has been identified as an upstream 
regulator of SMC 1 A [57], further suggesting an important role for 
this protein in cancer and chemotherapeutic response. SMC1A 
has also been linked to cellular growth rate and was identified 
within cluster one which included proteins whose levels were 
associated with reduced cytotoxicity and apoptosis phenotypes 
across both drugs. 

Another protein we functionally validated associated with 
paclitaxel, ZNF569, was a notable candidate because it has been 
functionally implicated as a transcriptional repressor that sup- 
presses MAPK signaling [58]. Because of the importance of 
MAPK signaling in breast cancer [59] and the common use of 
paclitaxel as a breast cancer therapy [60] , this association presents 
an interesting biological mechanism and potential therapeutic 
marker. ZNF569 is supported in our data as a transcriptional 
suppressor of MAPK signaling, because lower ZNF569 protein 
levels were correlated with increased cellular survival. In addition, 
ZNF569 was also found in the cluster of proteins that negatively 
correlated more strongly with cytotoxicity than apoptosis for both 
drugs, perhaps indicating a role for ZNF569 in cell growth 
inhibition unrelated to caspase 3/7 activation. 

Notably, this study focused on two widely used but mechanis- 
tically distinct agents. By examining two distinct cell phenotypes, 
cell growth inhibition and caspase 3/7 activation, our study 
identified proteins associated with different cell signaling pathways 
responsible for cell growth inhibition. Although our study did not 
reveal candidates with strikingly high effect sizes that were 
predictive of drug sensitivity, it revealed many unique proteins 
whose expression levels were correlated with phenotypic measure- 
ments for a single drug. This observation is consistent with 
multiple proteins contributing small influences to drug sensitivity. 
The protein data collected in this study allowed us to gain a new 
understanding of the potential mechanisms and pathways relevant 
for cell viability and the genetic variants regulating those proteins. 

Interpreting GWAS results continues to present challenges; 
increasingly, eQTL studies are being used to inform [25,61,62] 
interpretation of these results and are the focus of expanded studies 
to understand biological mechanisms [63,64]. These association 
tests have been extended to other functional units in the genome 
from microRNAs [20] to DNA hypersensitivity sites [19] and 
modified cytosines [17]. The main factor limiting the inclusion of 
proteins in GWAS studies has been the lack of a reliable, high- 
throughput methodology to quantify them across populations of 
individuals. The approach described in this study, including the 
newly developed microwestern array [32], has started to bridge 
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that technological gap [33], and this study demonstrates the utility 
of targeted protein-omic datasets to understand cellular pheno- 
types and genomic studies. 

Materials and Methods 

Cell lines 

YRI LCLs derived from unrelated individuals from the 
population residing in Ibadan, Nigeria (n = 68) were chosen for 
consistency with publicly available mRNA expression data on a 
single population [16]. LCLs were cultured in RPMI 1640 media 
containing 20 mM L-glutamine and either 15% fetal bovine 
serum (Hyclone, Logan, UT) for baseline protein quantification, 
cisplatin and paclitaxel apoptosis and cisplatin cytotoxicity 
experiments or bovine growth serum (Hyclone, Logan, UT) for 
paclitaxel cytotoxicity experiments. Cell lines were diluted three 
times per week at a concentration of 300,000-350,000 cells/mL 
and maintained in a 37°C, 5% CO z humidified incubator. 
Medium and components were purchased from Cellgro (Herndon, 
VA). 

Drug-induced cell apoptosis and cytotoxicity phenotypes 

Drug-induced apoptosis and cytotoxicity phenotypes were 
determined at 5 |J,M cisplatin and 12.5 nM paclitaxel. Both drugs 
were prepared as described previously: cisplatin [65] and 
paclitaxel [66] . The cytotoxic effect of cisplatin [65] and paclitaxel 
[66] was determined using a short-term cellular growth inhibition 
assay, and the apoptotic effect was measured using a caspase 3/7 
activity detection reagent Caspase-Glo 3/7 (Promega Corpora- 
tion, Madison, WI). 

Protein isolation 

Three independent thaws constituting biological replicates of 68 
unrelated YRI cell lines were propagated and pelleted (5.1 million 
cells per pellet). Cells were spun at 400 RPM, aspirated, and 
washed in ice-cold PBS. This process was repeated twice and then 
the pellets snap frozen in liquid nitrogen and placed at —80 
degrees. Total protein was extracted by re-suspension in 1 .0 mL of 
1.5% SDS lysis buffer (240 mM Tris-acetate, 1.5% w/v SDS, 
0.5% w/v glycerol, 5 mM EDTA) containing 50 mM DTT, 
protease inhibitors (1 |J,g/ mL aprotinin, 1 [ig/ mL leupeptin, 1 ug/ 
mL pepstatin), and phosphatase inhibitors (1 mM sodium 
orthovanadate, 10 mM (3-glycerophosphate). To ensure complete 
protein denaturation, samples were boiled for 10 min, sonicated 
for 10 min (alternating 30 s on, 30 s off) with a Bioruptor 
(Diagenode), and concentrated to 5-10 ug/uL using a 96-well 
micro-concentration device with a 10 kDa molecular weight cutoff 
(Millipore). 

Pilot study 

To identify sources of steady-state protein expression variation, 
we performed a pilot study to quantify 21 proteins across three 
independent cultures from two independent thaws from two YRI 
LCLs (NA18861 and NA19193). We performed a multifactorial 
ANOVA to assess the proportion of protein expression variation 
explained for each of these variables across all proteins. We 
observed a significant thaw effect explaining 3.75% of global 
protein expression variation (p = 0.01, Ftest), whereas culture only 
explained 0.13% of protein expression variation (p = 0.85, .Ftest). 
Using a mixed-effects model with a random nested effect, 
(1 | individual/thaw/ culture), only 2.71x10 14 % of protein ex- 
pression variation was between cultures within thaws, whereas 
5.29% of variation was between thaws within individuals. 



Protein quantification and analysis with pharmacologic 
phenotypes 

Initially, three biological replicates for each of 1 1-12 individuals 
were pooled together into six pools for screening 4,366 rabbit 
polyclonal antibodies at a 1: 1000 dilution. Printing, gel fabrication, 
horizontal semidry electrophoresis, transfer, blotting, and scanning 
were performed as in Ciaccio et al. [32], permitting 96 antibodies 
to be screened over six pooled lysates per MWA. The 4,366 
antibodies were directed against 1,848 unique TFs and 200 unique 
cell signaling proteins. Of this set, 198 antibodies producing a 
single predominant band the size of the targeted protein isoform of 
interest with a signal-to-noise ratio ^3 were selected for 
subsequent population-level quantification by RPPAs; antibodies 
that displayed at least one band the size of the targeted protein 
isoform of interest with a signal to noise ratio & 3 but additional 
bands were selected for subsequent population-level quantification 
by MWAs. This approach ultimately allowed us to quantify 
protein levels from 441 antibodies (341 TF and 100 signaling) 
directed at 391 unique protein isoforms across three biological 
replicates of 68 LCLs. Additional antibody details are listed in 
Table S5. 

For RPPA quantification, four technical replicates of each of 
three biological replicates of all 68 individuals were spotted using a 
noncontact piezoelectric microarrayer (GeSiM Nanoplotter 2. IE) 
onto nitrocellulose membranes (Biorad). Serial dilutions of each of 
the six pooled lysates used for the original antibody screen and an 
A431 skin carcinoma cell line control were also printed for each 
array to ensure the linearity and quality of the antibody signal. 
Features with background-subtracted integrated intensities <0 or 
signal to noise ratio <3 (Z test p>0.05) were identified in each 
array and excluded from further analysis. The distributions of 
background-corrected integrated intensities for all features on each 
array were first log 2 -quantile normalized using the limma package 
in R to correct for overall antibody hybridization efficiency 
differences in the signal. The relative expression of a given protein 
for a sample was then quantified using the linear model (1) 
yjp~ fij p + Aj + e, where Hjp is the log-quantile normalized, 
background-corrected integrated intensity of sample j on array p, 
~kj is the effect due to sample j across all arrays in a print (due to 
differing amounts of total protein spotted on the array for each 
sample), estimated by median^). Odyssey output text files were 
parsed in Python and quantified and normalized in R. 

For micro-western quantification, three technical replicates of 
each of the three biological replicates of all 68 individuals were 
spotted as above onto polyacrylamide gels. Gel fabrication, 
horizontal semidry electrophoresis, transfer, and scanning were 
performed as in Ciaccio et al. [32] with the exception of separating 
each blot into four quadrants rather than using a 96-well gasket to 
permit 68 x3 = 204 samples to be quantified with a single antibody 
on a single quadrant. Feature extraction and data normalization 
were performed as with RPPAs. For antibodies that produced 
multiple bands (signal to noise ratio >3), all isoforms were 
quantified and their relative molecular weights recorded. The 
expression of a given protein for an individual was quantified using 
the above linear model (equation 1) with the addition of a batch 
term (fi) to correct for global intensity distribution differences 
across multiple microwesterns for the same antibody. For 
replicates within platforms for the same antibody across the entire 
population, we took the average of the expression measurements. 
For replicates across platforms, we selected the platform that 
yielded the highest median background-corrected integrated 
intensity. To reduce the inflated effect of technical noise because 
of low antibody signals and provide more accurate inter-individual 
protein expression measurements, antibodies in the bottom deciles 
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of median background-corrected integrated intensities or in the 
top deciles of technical CVs for either platform were flagged and 
eliminated from further comparative analyses. 

For each protein measurement from either method, we 
constructed linear mixed effects models y~p+ C+ T\I + e, in 
which p is the array- and sample-load normalized integrated 
signal intensity for all biological replicates of all individuals 
comprising the population, C is the fixed effect of the drug, T\ I is 
the random thaw effect per individual, and e is the residual error. 
The model was fitted to each protein by residual maximum 
likelihood using the lmer function in the R package lme4 (v 
0.999999-0). This mixed effect model incorporates the direction of 
effect for each biological replicate and insures that those with 
conflicting directions would result in a less significant p-value. 
Fixed effect p-values for covariates were estimated using the 
pamer.fhc function in the LMERConvenienceFunctions 
package (v 1.6.8.3). The significance of covariate effects was 
assessed by estimating false discovery rates using Storey's q-value 
method. 

Hierarchical Clustering. Hierarchical clustering was performed in R 
using Euclidean distance and the Ward method in hclustO for the 
standardized coefficients between the regression of 370 protein 
isoforms (rows) by the four drug phenotypes (columns), with the 
apoptosis coefficients inverted to match directionality with the 
cytotoxicity coefficients. The number of significant clusters was 
determined by performing 1000 permutations of the column 
coefficients, clustering them, and selecting the number of observed 
clusters at a tree height that significandy exceeded all tree heights 
from the permutations (k= 7, p<0.001). 

Genome-wide association studies 

HapMap genotypes were obtained from the 1000 genomes, 
June 20 1 1 , phase I, low-pass whole genome SNP genotype release 
(www.1000genomes.org). Missing values were imputed by BIM- 
BAM (v 1.0) using the default parameters to derive mean imputed 
genotypes. SNPs with MAF<0.05 and SNPs with significant 
deviation from Hardy- Weinberg equilibrium (Fischer's exact test, 
p<0.001) were excluded, reducing the set to 9,345,571 SNPs and 
indels for association analyses. To ensure that low MAF SNPs 
were not generating spurious associations due to outiiers, we 
compared the MAF distribution of SNPs associated with protein 
and drug phenotypes with all SNPs (Figure S4). The average 
MAFs for protein (.17) and drug (.15) associations do not show a 
bias as compared with the genome (.16). Each protein expression 
measurement was inverse normal transformed prior to association 
analysis. Drug-induced cytotoxicity phenotypes were log-trans- 
formed to better approximate normal distributions. We tested for 
normality using the Shapiro-Wilk test and none of the drug 
phenotypes deviated significantly from normality (p>0.001). We 
selected this threshold because of the smaller sample size and also 
examined the frequency distribution to ensure that outliers were 
not substantially driving false positive associations. Protein 
expression and drug phenotypes were then tested for association 
with all markers genome-wide by linear regression implemented in 
Python and R using custom scripts. For each phenotype, we 
selected the most significandy associated SNV within each 
recombination window, defined by splitting the genome into 
25,307 blocks flanked by >10 cM/Mb recombination rates 
estimated from HapMap. 

Enrichment analysis 

For each drug, we generated 1,000 randomly selected sets of 
SNPs of the same size and matching the same MAF distribution as 
all SNPs significantly associated with that drug (dQTLs) at 



p<10 J and examined the overlap of these dQTLs with pQTLs 
and eQTLs at p<10~ 4 , as previously described [25]. We 
empirically determined the enrichment p-value by comparing 
the observed dQTL-pQTL or dQTL-eQTL SNP overlap to the 
null distribution. We also evaluated enrichment of dQTLs at 
p< 1 0 4 for the SNP-transcript association to test the robustness of 
an enrichment result to the choice of p-value threshold. To 
investigate whether the observed enrichment of dQTLs to be 
pQTLs or eQTLs was driven by linkage disequilibrium, we 
performed an additional simulation analysis after selecting only the 
most significant dQTLs for each recombination block. 

siRNA nucleofection 

LCLs were seeded at a density of 550,000 cells/mL 24 hours 
before nucleofection. Amaxa's Cell Line 96-well Nucleofector Kit 
SF (Lonza Inc, Basel, Switzerland) was used to perform the 
transfection. Cells were centrifuged at 90 g for 10 minutes at room 
temperature and resuspended at a concentration of 1,000,000 cells 
in 20 |XL of SF/supplement solution (included in SF Kit Lonza 
Catalog #V4SC2096) and 2 uM final concentration of AllStars 
negative Control siRNA labeled with AlexaFluor488 (Qiagen Inc., 
Valencia, CA) or a pool of siRNA (Qiagen) (See Table SI). The 
cells were nucleofected using Amaxa's DN-100 program. Cells 
were allowed to rest for 10 minutes before the addition of pre- 
warmed (in 37° water bath for a minimum of 20 minutes) RPMI 
media and then another 5 minutes after the addition of warm 
RPMI media. Cells were then plated for protein measurements 
and drug treatments. Cells were harvested at 24 and 48 hours 
post-nucleofection for protein measurement. Drug treatment was 
done 1 8 hours following transfection for cell survival measurement 
and 24 hours after transfection for apoptosis measurement. 
Apoptosis was measured as described above, whereas cell survival 
was measured as described above for cisplatin and using Cell-Titer 
Glo (Promega) for paclitaxel. Each experiment was done twice, 
with two independent transfections. 

siRNA analyses 

To assess the size and significance of the effect of siRNA 
knockdown on drug response (survival for cytotoxicity assay and 
caspase activity for apoptosis assay) we fit the following linear 
mixed effect model: response ~ knockdown + dose + ( 1 1 id) + 
(\\experiment) , in which knockdown is 1 if the gene was knocked 
down and 0 if scrambled. Cell line id (denoted by id) and experiment 
were used as random effects to properly account for correlation 
between replicates. To increase precision, we pooled the data from 
all cell lines. The mixed effects model was fit using lme4 package in 
the R Statistical package (http://cran.r-project.org/). The good- 
ness-of-fit of the model was assessed by examining the residuals. 
Normality of the residuals was assessed using the Shapiro-Wilk test 
in the R Statistical package. Log-transformation of the response 
variable was used to achieve approximate normality. 

Supporting Information 

Figure SI Cellular growth inhibition is inversely correlated with 
caspase 3/7 apoptosis measurements. In 68 unrelated YRI LCLs, 
both cisplatin (a) and paclitaxel (b) cytotoxic phenotypes were 
negatively correlated with apoptosis measurements. Paclitaxel's (c) 
correlation (r 2 = 0.35) was much stronger than cisplatin's (d) 
correlation (r = 0.04). 
(TIF) 

Figure S2 Correlation of cellular phenotypes across thaw. We 
correlated the cellular phenotypes for cisplatin cytotoxicity (a) and 
apoptosis (b) and paclitaxel cytotoxicity (c) and apoptosis (d) using 
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63 cell lines for cytotoxicity and 21 for apoptosis. Both cytotoxicity 
phenotypes were correlated p<0.0001 with r 2 of .28 (a) and .35 (c). 
Apoptosis phenotypes were correlated p<0.003 with r 2 of .63 (b) 
and .38 (d). 
(TIF) 

Figure S3 Role of growth in proteins associated with chemo- 
therapeutic phenotypes. SMC1A protein levels (a) are significandy 
correlated with intrinsic growth rate (p = 0.001) whereas ZNF569 
(b) was not (p>0.05). 
(TIF) 

Figure S4 Minor allele frequency distribution comparing 
associated SNPs with all SNPs. SNPs associated with proteins' 
MAF distribution (middle) contained more common MAF variants 
than all SNPs tested genome-wide (left, two-sample Kolmogorov- 
Smirnov test p= 1, pQTL median MAF = 0.1 7 vs. genome-wide 
median MAF = 0.16). However, we appreciate that low MAF 
variants are statistically more prevalent in our dQTL associations 
(right) (C, K-S test p<0.05) but by not a large magnitude (dQTL 
median MAF = 0.1 5). 
(TIF) 

Table SI siRNA used in functional experiments. The siRNAs 
that were purchased from Qiagen and pooled are listed for each 
gene indicated. The asterisk indicates that the siRNA was 
functionally validated to the target gene by Qiagen. 
(DOC) 

Table S2 Relationship between drug phenotypes and proteins. 
The mixed effect model (MEM) p-value and beta for all nominal 
(p<0.05) correlations between each phenotype (5 uM cisplatin 
apoptosis, 5 pM cisplatin cytotoxicity, 12.5 nM paclitaxel apop- 
tosis, 12.5 nM paclitaxel cytotoxicity) and proteins are listed. 
(XLSX) 
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